{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Models\n",
    "\n",
    "In many cases, agents need access to LLM model services such as OpenAI, Azure OpenAI, or local models. Since there are many different providers with different APIs, `autogen-core` implements a protocol for model clients and `autogen-ext` implements a set of model clients for popular model services. AgentChat can use these model clients to interact with model services. \n",
    "\n",
    "This section provides a quick overview of available model clients.\n",
    "For more details on how to use them directly, please refer to [Model Clients](../../core-user-guide/components/model-clients.ipynb) in the Core API documentation.\n",
    "\n",
    "```{note}\n",
    "See {py:class}`~autogen_ext.models.cache.ChatCompletionCache` for a caching wrapper to use with the following clients.\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Log Model Calls\n",
    "\n",
    "AutoGen uses standard Python logging module to log events like model calls and responses.\n",
    "The logger name is {py:attr}`autogen_core.EVENT_LOGGER_NAME`, and the event type is `LLMCall`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import logging\n",
    "\n",
    "from autogen_core import EVENT_LOGGER_NAME\n",
    "\n",
    "logging.basicConfig(level=logging.WARNING)\n",
    "logger = logging.getLogger(EVENT_LOGGER_NAME)\n",
    "logger.addHandler(logging.StreamHandler())\n",
    "logger.setLevel(logging.INFO)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## OpenAI\n",
    "\n",
    "To access OpenAI models, install the `openai` extension, which allows you to use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "shellscript"
    }
   },
   "outputs": [],
   "source": [
    "pip install \"autogen-ext[openai]\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You will also need to obtain an [API key](https://platform.openai.com/account/api-keys) from OpenAI."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
    "\n",
    "openai_model_client = OpenAIChatCompletionClient(\n",
    "    model=\"gpt-4o-2024-08-06\",\n",
    "    # api_key=\"sk-...\", # Optional if you have an OPENAI_API_KEY environment variable set.\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To test the model client, you can use the following code:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CreateResult(finish_reason='stop', content='The capital of France is Paris.', usage=RequestUsage(prompt_tokens=15, completion_tokens=7), cached=False, logprobs=None)\n"
     ]
    }
   ],
   "source": [
    "from autogen_core.models import UserMessage\n",
    "\n",
    "result = await openai_model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
    "print(result)\n",
    "await openai_model_client.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```{note}\n",
    "You can use this client with models hosted on OpenAI-compatible endpoints, however, we have not tested this functionality.\n",
    "See {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient` for more information.\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Azure OpenAI\n",
    "\n",
    "Similarly, install the `azure` and `openai` extensions to use the {py:class}`~autogen_ext.models.openai.AzureOpenAIChatCompletionClient`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "shellscript"
    }
   },
   "outputs": [],
   "source": [
    "pip install \"autogen-ext[openai,azure]\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To use the client, you need to provide your deployment id, Azure Cognitive Services endpoint, api version, and model capabilities.\n",
    "For authentication, you can either provide an API key or an Azure Active Directory (AAD) token credential.\n",
    "\n",
    "The following code snippet shows how to use AAD authentication.\n",
    "The identity used must be assigned the [Cognitive Services OpenAI User](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/role-based-access-control#cognitive-services-openai-user) role."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from autogen_core.models import UserMessage\n",
    "from autogen_ext.auth.azure import AzureTokenProvider\n",
    "from autogen_ext.models.openai import AzureOpenAIChatCompletionClient\n",
    "from azure.identity import DefaultAzureCredential\n",
    "\n",
    "# Create the token provider\n",
    "token_provider = AzureTokenProvider(\n",
    "    DefaultAzureCredential(),\n",
    "    \"https://cognitiveservices.azure.com/.default\",\n",
    ")\n",
    "\n",
    "az_model_client = AzureOpenAIChatCompletionClient(\n",
    "    azure_deployment=\"{your-azure-deployment}\",\n",
    "    model=\"{model-name, such as gpt-4o}\",\n",
    "    api_version=\"2024-06-01\",\n",
    "    azure_endpoint=\"https://{your-custom-endpoint}.openai.azure.com/\",\n",
    "    azure_ad_token_provider=token_provider,  # Optional if you choose key-based authentication.\n",
    "    # api_key=\"sk-...\", # For key-based authentication.\n",
    ")\n",
    "\n",
    "result = await az_model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
    "print(result)\n",
    "await az_model_client.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "See [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/managed-identity#chat-completions) for how to use the Azure client directly or for more information."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Azure AI Foundry\n",
    "\n",
    "[Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/) (previously known as Azure AI Studio) offers models hosted on Azure.\n",
    "To use those models, you use the {py:class}`~autogen_ext.models.azure.AzureAIChatCompletionClient`.\n",
    "\n",
    "You need to install the `azure` extra to use this client."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "shellscript"
    }
   },
   "outputs": [],
   "source": [
    "pip install \"autogen-ext[azure]\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below is an example of using this client with the Phi-4 model from [GitHub Marketplace](https://github.com/marketplace/models)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "finish_reason='stop' content='The capital of France is Paris.' usage=RequestUsage(prompt_tokens=14, completion_tokens=8) cached=False logprobs=None\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "\n",
    "from autogen_core.models import UserMessage\n",
    "from autogen_ext.models.azure import AzureAIChatCompletionClient\n",
    "from azure.core.credentials import AzureKeyCredential\n",
    "\n",
    "client = AzureAIChatCompletionClient(\n",
    "    model=\"Phi-4\",\n",
    "    endpoint=\"https://models.github.ai/inference\",\n",
    "    # To authenticate with the model you will need to generate a personal access token (PAT) in your GitHub settings.\n",
    "    # Create your PAT token by following instructions here: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens\n",
    "    credential=AzureKeyCredential(os.environ[\"GITHUB_TOKEN\"]),\n",
    "    model_info={\n",
    "        \"json_output\": False,\n",
    "        \"function_calling\": False,\n",
    "        \"vision\": False,\n",
    "        \"family\": \"unknown\",\n",
    "        \"structured_output\": False,\n",
    "    },\n",
    ")\n",
    "\n",
    "result = await client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
    "print(result)\n",
    "await client.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Anthropic (experimental)\n",
    "\n",
    "To use the {py:class}`~autogen_ext.models.anthropic.AnthropicChatCompletionClient`, you need to install the `anthropic` extra. Underneath, it uses the `anthropic` python sdk to access the models.\n",
    "You will also need to obtain an [API key](https://console.anthropic.com) from Anthropic."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# !pip install -U \"autogen-ext[anthropic]\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "finish_reason='stop' content=\"The capital of France is Paris. It's not only the political and administrative capital but also a major global center for art, fashion, gastronomy, and culture. Paris is known for landmarks such as the Eiffel Tower, the Louvre Museum, Notre-Dame Cathedral, and the Champs-Élysées.\" usage=RequestUsage(prompt_tokens=14, completion_tokens=73) cached=False logprobs=None thought=None\n"
     ]
    }
   ],
   "source": [
    "from autogen_core.models import UserMessage\n",
    "from autogen_ext.models.anthropic import AnthropicChatCompletionClient\n",
    "\n",
    "anthropic_client = AnthropicChatCompletionClient(model=\"claude-3-7-sonnet-20250219\")\n",
    "result = await anthropic_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
    "print(result)\n",
    "await anthropic_client.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Ollama (experimental)\n",
    "\n",
    "[Ollama](https://ollama.com/) is a local model server that can run models locally on your machine.\n",
    "\n",
    "```{note}\n",
    "Small local models are typically not as capable as larger models on the cloud.\n",
    "For some tasks they may not perform as well and the output may be suprising.\n",
    "```\n",
    "\n",
    "To use Ollama, install the `ollama` extension and use the {py:class}`~autogen_ext.models.ollama.OllamaChatCompletionClient`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "shellscript"
    }
   },
   "outputs": [],
   "source": [
    "pip install -U \"autogen-ext[ollama]\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "finish_reason='unknown' content='The capital of France is Paris.' usage=RequestUsage(prompt_tokens=32, completion_tokens=8) cached=False logprobs=None thought=None\n"
     ]
    }
   ],
   "source": [
    "from autogen_core.models import UserMessage\n",
    "from autogen_ext.models.ollama import OllamaChatCompletionClient\n",
    "\n",
    "# Assuming your Ollama server is running locally on port 11434.\n",
    "ollama_model_client = OllamaChatCompletionClient(model=\"llama3.2\")\n",
    "\n",
    "response = await ollama_model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
    "print(response)\n",
    "await ollama_model_client.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Gemini (experimental)\n",
    "\n",
    "Gemini currently offers [an OpenAI-compatible API (beta)](https://ai.google.dev/gemini-api/docs/openai).\n",
    "So you can use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient` with the Gemini API.\n",
    "\n",
    "```{note}\n",
    "While some model providers may offer OpenAI-compatible APIs, they may still have minor differences.\n",
    "For example, the `finish_reason` field may be different in the response.\n",
    "\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "finish_reason='stop' content='Paris\\n' usage=RequestUsage(prompt_tokens=7, completion_tokens=2) cached=False logprobs=None thought=None\n"
     ]
    }
   ],
   "source": [
    "from autogen_core.models import UserMessage\n",
    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
    "\n",
    "model_client = OpenAIChatCompletionClient(\n",
    "    model=\"gemini-1.5-flash-8b\",\n",
    "    # api_key=\"GEMINI_API_KEY\",\n",
    ")\n",
    "\n",
    "response = await model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
    "print(response)\n",
    "await model_client.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Also, as Gemini adds new models, you may need to define the models capabilities via the model_info field. For example, to use `gemini-2.0-flash-lite` or a similar new model, you can use the following code:\n",
    "\n",
    "```python \n",
    "from autogen_core.models import UserMessage\n",
    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
    "from autogen_core.models import ModelInfo\n",
    "\n",
    "model_client = OpenAIChatCompletionClient(\n",
    "    model=\"gemini-2.0-flash-lite\",\n",
    "    model_info=ModelInfo(vision=True, function_calling=True, json_output=True, family=\"unknown\", structured_output=True)\n",
    "    # api_key=\"GEMINI_API_KEY\",\n",
    ")\n",
    "\n",
    "response = await model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
    "print(response)\n",
    "await model_client.close()\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Llama API (experimental)\n",
    "\n",
    "[Llama API](https://llama.developer.meta.com?utm_source=partner-autogen&utm_medium=readme) is the Meta's first party API offering. It currently offers an [OpenAI compatible endpoint](https://llama.developer.meta.com/docs/features/compatibility).\n",
    "So you can use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient` with the Llama API.\n",
    "\n",
    "This endpoint fully supports the following OpenAI client library features:\n",
    "* Chat completions\n",
    "* Model selection\n",
    "* Temperature/sampling\n",
    "* Streaming\n",
    "* Image understanding\n",
    "* Structured output (JSON mode)\n",
    "* Function calling (tools)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "from autogen_core import Image\n",
    "from autogen_core.models import UserMessage\n",
    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
    "\n",
    "# Text\n",
    "model_client = OpenAIChatCompletionClient(\n",
    "    model=\"Llama-4-Scout-17B-16E-Instruct-FP8\",\n",
    "    # api_key=\"LLAMA_API_KEY\"\n",
    ")\n",
    "\n",
    "response = await model_client.create([UserMessage(content=\"Write me a poem\", source=\"user\")])\n",
    "print(response)\n",
    "await model_client.close()\n",
    "\n",
    "# Image\n",
    "model_client = OpenAIChatCompletionClient(\n",
    "    model=\"Llama-4-Maverick-17B-128E-Instruct-FP8\",\n",
    "    # api_key=\"LLAMA_API_KEY\"\n",
    ")\n",
    "image = Image.from_file(Path(\"test.png\"))\n",
    "\n",
    "response = await model_client.create([UserMessage(content=[\"What is in this image\", image], source=\"user\")])\n",
    "print(response)\n",
    "await model_client.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Semantic Kernel Adapter\n",
    "\n",
    "The {py:class}`~autogen_ext.models.semantic_kernel.SKChatCompletionAdapter`\n",
    "allows you to use Semantic kernel model clients as a\n",
    "{py:class}`~autogen_core.models.ChatCompletionClient` by adapting them to the required interface.\n",
    "\n",
    "You need to install the relevant provider extras to use this adapter. \n",
    "\n",
    "The list of extras that can be installed:\n",
    "\n",
    "- `semantic-kernel-anthropic`: Install this extra to use Anthropic models.\n",
    "- `semantic-kernel-google`: Install this extra to use Google Gemini models.\n",
    "- `semantic-kernel-ollama`: Install this extra to use Ollama models.\n",
    "- `semantic-kernel-mistralai`: Install this extra to use MistralAI models.\n",
    "- `semantic-kernel-aws`: Install this extra to use AWS models.\n",
    "- `semantic-kernel-hugging-face`: Install this extra to use Hugging Face models.\n",
    "\n",
    "For example, to use Anthropic models, you need to install `semantic-kernel-anthropic`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "shellscript"
    }
   },
   "outputs": [],
   "source": [
    "# pip install \"autogen-ext[semantic-kernel-anthropic]\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To use this adapter, you need create a Semantic Kernel model client and pass it to the adapter.\n",
    "\n",
    "For example, to use the Anthropic model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "finish_reason='stop' content='The capital of France is Paris. It is also the largest city in France and one of the most populous metropolitan areas in Europe.' usage=RequestUsage(prompt_tokens=0, completion_tokens=0) cached=False logprobs=None\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "\n",
    "from autogen_core.models import UserMessage\n",
    "from autogen_ext.models.semantic_kernel import SKChatCompletionAdapter\n",
    "from semantic_kernel import Kernel\n",
    "from semantic_kernel.connectors.ai.anthropic import AnthropicChatCompletion, AnthropicChatPromptExecutionSettings\n",
    "from semantic_kernel.memory.null_memory import NullMemory\n",
    "\n",
    "sk_client = AnthropicChatCompletion(\n",
    "    ai_model_id=\"claude-3-5-sonnet-20241022\",\n",
    "    api_key=os.environ[\"ANTHROPIC_API_KEY\"],\n",
    "    service_id=\"my-service-id\",  # Optional; for targeting specific services within Semantic Kernel\n",
    ")\n",
    "settings = AnthropicChatPromptExecutionSettings(\n",
    "    temperature=0.2,\n",
    ")\n",
    "\n",
    "anthropic_model_client = SKChatCompletionAdapter(\n",
    "    sk_client, kernel=Kernel(memory=NullMemory()), prompt_settings=settings\n",
    ")\n",
    "\n",
    "# Call the model directly.\n",
    "model_result = await anthropic_model_client.create(\n",
    "    messages=[UserMessage(content=\"What is the capital of France?\", source=\"User\")]\n",
    ")\n",
    "print(model_result)\n",
    "await anthropic_model_client.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Read more about the [Semantic Kernel Adapter](../../../reference/python/autogen_ext.models.semantic_kernel.rst)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
